Specify engine path for TRT evaluation #313
Conversation
Walkthrough

Adds a required `--engine_path` argument to the evaluation script and threads it through `TRTLocalClient.ir_to_compiled` to the TensorRT engine builder.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant EvalScript as evaluate.py
    participant TRTClient as TRTLocalClient
    participant EngineBuilder as TensorRT Engine Builder
    User->>EvalScript: Run with --engine_path=/path/to/model.engine
    EvalScript->>TRTClient: ir_to_compiled(onnx_bytes, {engine_path})
    TRTClient->>EngineBuilder: build_engine(onnx_bytes, engine_path=...)
    rect rgba(200, 230, 255, 0.25)
        note right of EngineBuilder: Resolve initial engine write location\n(use provided engine_path or default)
        EngineBuilder-->>EngineBuilder: Build engine (TRT)
        EngineBuilder-->>EngineBuilder: Read, hash, move to final_output_dir
    end
    EngineBuilder-->>TRTClient: (engine_bytes | None, logs)
    TRTClient-->>EvalScript: compiled_engine / error
    EvalScript-->>User: Evaluation results / logs
```
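To make the wiring concrete, here is a minimal, self-contained mock of the diagrammed call chain. Every name below (`build_engine_stub`, `ClientStub`) is a stand-in for illustration, not the real modelopt API:

```python
# Mock of the diagrammed flow; stand-in names only, not the real modelopt API.
from typing import Optional


def build_engine_stub(
    onnx_bytes: bytes, engine_path: Optional[str] = None
) -> tuple[Optional[bytes], str]:
    # The real builder shells out to trtexec; this stub just echoes its inputs.
    target = engine_path or "/tmp/default.engine"
    return onnx_bytes, f"engine written to {target}"


class ClientStub:
    def ir_to_compiled(
        self, onnx_bytes: bytes, compilation_args: Optional[dict] = None
    ) -> bytes:
        args = compilation_args or {}
        engine_bytes, logs = build_engine_stub(
            onnx_bytes, engine_path=args.get("engine_path")
        )
        if engine_bytes is None:
            raise RuntimeError(logs)
        return engine_bytes


# evaluate.py's role in the diagram: forward the CLI flag as a compilation arg.
engine = ClientStub().ir_to_compiled(b"onnx-bytes", {"engine_path": "/tmp/model.engine"})
```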
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches: ❌ 1 failed check (warning), ✅ 2 passed.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
modelopt/torch/_deploy/_runtime/tensorrt/engine_builder.py (1)
234-241: Keep user-provided engine_path intact; copy hashed artifact into output_dir instead of moving it

Moving a user-specified engine file surprises callers and can break downstream code that expects that exact path to keep existing. Preserve the file at engine_path and write the hashed copy into final_output_dir. Also capture whether engine_path was provided before it is shadowed.
```diff
@@
-    with TemporaryDirectory() as tmp_dir:
-        onnx_path, engine_path, calib_cache_path, timing_cache_path, final_output_dir = (
-            _setup_files_and_paths(Path(tmp_dir), engine_path)
-        )
+    with TemporaryDirectory() as tmp_dir:
+        user_provided_engine_path = engine_path is not None
+        onnx_path, engine_path, calib_cache_path, timing_cache_path, final_output_dir = (
+            _setup_files_and_paths(Path(tmp_dir), engine_path)
+        )
@@
-        shutil.move(engine_path, engine_path_with_hash)
+        # If caller provided a custom engine_path, keep it intact and also persist a hashed copy.
+        if user_provided_engine_path and engine_path.resolve() != engine_path_with_hash.resolve():
+            shutil.copy2(engine_path, engine_path_with_hash)
+        else:
+            shutil.move(engine_path, engine_path_with_hash)
         logging.info("Engine saved to %s", engine_path_with_hash)
```

Also applies to: 223-227
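As a standalone illustration of the copy-versus-move behavior suggested above, a minimal sketch follows; `persist_engine` and its parameters are hypothetical names invented for this example, not code from the PR:

```python
# Hypothetical sketch: keep a user-provided engine file intact while storing
# a hashed copy in the artifact dir; names are illustrative, not from the PR.
import hashlib
import shutil
from pathlib import Path


def persist_engine(
    engine_path: Path, output_dir: Path, model_name: str, user_provided: bool
) -> Path:
    digest = hashlib.sha256(engine_path.read_bytes()).hexdigest()[:8]
    hashed_path = output_dir / f"{digest}-{model_name}.engine"
    if user_provided and engine_path.resolve() != hashed_path.resolve():
        shutil.copy2(engine_path, hashed_path)  # preserve the caller's file
    else:
        shutil.move(engine_path, hashed_path)  # temp-dir default: moving is fine
    return hashed_path
```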
modelopt/torch/_deploy/_runtime/trt_client.py (1)
86-95: Guard compilation_args=None and fail fast on build failure

Current code will raise AttributeError if compilation_args is None; callers also receive None on failure despite the declared bytes return type.
```diff
         self.node_names = get_node_names_from_bytes(onnx_model_file_bytes)
-        engine_bytes, _ = build_engine(
+        compilation_args = compilation_args or {}
+        engine_bytes, _ = build_engine(
             onnx_bytes,
             dynamic_shapes=compilation_args.get("dynamic_shapes"),  # type: ignore[union-attr]
             plugin_config=compilation_args.get("plugin_config"),  # type: ignore[union-attr]
             engine_path=compilation_args.get("engine_path"),  # type: ignore[union-attr]
             trt_mode=self.deployment["precision"],
             verbose=(self.deployment.get("verbose", "false").lower() == "true"),
         )
-        self.engine_bytes = engine_bytes
+        if engine_bytes is None:
+            raise RuntimeError("TensorRT engine build failed; see trtexec logs for details.")
+        self.engine_bytes = engine_bytes
         return engine_bytes
```
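The same guard-and-fail-fast pattern, shown as a compact runnable demo with a stubbed builder (all names here are hypothetical):

```python
# Hypothetical demo of the suggested guard + fail-fast pattern.
from typing import Any, Callable, Optional


def compile_or_raise(
    compilation_args: Optional[dict[str, Any]],
    build: Callable[..., tuple[Optional[bytes], str]],
) -> bytes:
    compilation_args = compilation_args or {}  # tolerate None from callers
    engine_bytes, logs = build(engine_path=compilation_args.get("engine_path"))
    if engine_bytes is None:  # honor the declared `bytes` return type
        raise RuntimeError(f"TensorRT engine build failed: {logs}")
    return engine_bytes


# A failing builder stub makes the error path visible:
def failing_build(**_kwargs) -> tuple[Optional[bytes], str]:
    return None, "trtexec exited with status 1"


try:
    compile_or_raise(None, failing_build)
except RuntimeError as exc:
    print(exc)  # TensorRT engine build failed: trtexec exited with status 1
```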
🧹 Nitpick comments (3)
modelopt/torch/_deploy/_runtime/tensorrt/engine_builder.py (1)
137-147: Clarify engine_path semantics in the docstring

The doc says "Path to save the TensorRT engine," but the code (and the suggested change) preserves that path and also writes a hashed copy into output_dir. Make this explicit.
```diff
-        engine_path: Path to save the TensorRT engine.
+        engine_path: Optional path where trtexec will initially write the engine (via --saveEngine).
+            The final, hashed artifact is stored under `output_dir` (or the default artifact dir)
+            as `<hash>-{model_name}.engine`. When a custom `engine_path` is provided, that file
+            is preserved and a hashed copy is saved in the artifact dir.
```

modelopt/torch/_deploy/_runtime/trt_client.py (1)
77-79: Doc: reference the builder's engine_path behavior

Make it clear this argument is forwarded to the builder, and point readers to the builder docs for final file placement.
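One possible wording, sketched as a hypothetical docstring (class and signature abridged for illustration, not the file's actual code):

```python
# Hypothetical docstring sketch for ir_to_compiled; abridged, illustrative only.
from typing import Any, Optional


class TRTLocalClientSketch:
    def ir_to_compiled(
        self, onnx_bytes: bytes, compilation_args: Optional[dict[str, Any]] = None
    ) -> bytes:
        """Compile serialized ONNX into a TensorRT engine.

        Args:
            onnx_bytes: Serialized ONNX model.
            compilation_args: Optional dict; ``engine_path`` is forwarded
                unchanged to the engine builder. See the builder's docstring
                for where the final hashed artifact is placed.
        """
        raise NotImplementedError  # sketch only
```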
examples/onnx_ptq/evaluate.py (1)
38-43: Clarify that --engine_path is an output path used during build

Avoid confusion with a "load existing engine" path.
```diff
-    parser.add_argument(
+    parser.add_argument(
         "--engine_path",
         type=str,
         required=True,
-        help="Path to the TensorRT engine",
+        help="Output path where trtexec should write the engine (will also be copied to the artifact dir)",
     )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)

- examples/onnx_ptq/evaluate.py (2 hunks)
- modelopt/torch/_deploy/_runtime/tensorrt/engine_builder.py (3 hunks)
- modelopt/torch/_deploy/_runtime/trt_client.py (2 hunks)
- tests/examples/test_onnx_ptq.sh (2 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
tests/examples/test_onnx_ptq.sh
[error] 177-177: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 177-177: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (3)
examples/onnx_ptq/evaluate.py (1)
92-93: Confirmed `ir_to_compiled` supports a second argument; no changes needed.

tests/examples/test_onnx_ptq.sh (2)
132-134: LGTM: calibration_eps placement

Passing --calibration_eps=cuda:0 works with CUDA_VISIBLE_DEVICES scoping.
165-174: Engine/ONNX path selection per mode looks consistent

Mode-specific engine_path and eval_model_path wiring aligns with the new CLI.
Also applies to: 180-188
if [ "$quant_mode" == "fp16" ]; then | ||
eval_model_path=$model_dir/fp16/model.onnx | ||
engine_path=$model_dir/fp16/model.engine | ||
elif [ "$quant_mode" == "int8_iq" ]; then | ||
eval_model_path=$model_dir/fp16/model.onnx | ||
engine_path=$model_dir/int8_iq/model.engine | ||
else | ||
eval_model_path=$model_dir/$quant_mode/model.quant.onnx | ||
engine_path=$model_dir/$quant_mode/model.quant.engine | ||
fi |
🛠️ Refactor suggestion
Fix membership test to satisfy ShellCheck (SC2199, SC2076) and be robust
Use glob matching with ${array[*]}.
- if [[ " ${latency_models[@]} " =~ " $model_name " ]]; then
+ # Using glob match on the space-joined array avoids SC2199/SC2076 pitfalls
+ if [[ " ${latency_models[*]} " == *" $model_name "* ]]; then
Also applies to: 177-177
🤖 Prompt for AI Agents
```
In tests/examples/test_onnx_ptq.sh around lines 165-174 (and also line 177), the
script uses direct string equality checks which trigger ShellCheck
SC2199/SC2076; replace those with a proper membership test using glob matching
on the joined array (e.g., [[ " ${array[*]} " == *" $quant_mode "* ]]). Define
the arrays of allowed modes where appropriate, then change the if/elif
conditions to use the [[ " ${array[*]} " == *" $quant_mode "* ]] pattern so
membership checks are robust and ShellCheck-compliant, keeping the same branches
for fp16, int8_iq and the fallback.
```
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```
@@           Coverage Diff           @@
##             main     #313   +/-   ##
=======================================
  Coverage   73.82%   73.82%
=======================================
  Files         172      172
  Lines       17438    17439    +1
=======================================
+ Hits        12874    12875    +1
  Misses       4564     4564
```

☔ View full report in Codecov by Sentry.
LGTM
Signed-off-by: ajrasane <[email protected]>

Force-pushed from e98c325 to 959a173.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/examples/test_onnx_ptq.sh (1)
32-34: Guard against missing/invalid GPU info (prevents divide-by-zero and numeric test errors)

Handle empty/N/A outputs and zero GPUs before using modulo or numeric comparisons.
Apply this diff:
```diff
-nvidia_gpu_count=$(nvidia-smi --query-gpu=count --format=csv,noheader,nounits | head -n 1)
-cuda_capability=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1 | tr -d .)
+nvidia_gpu_count=$(nvidia-smi --query-gpu=count --format=csv,noheader,nounits 2>/dev/null | head -n 1)
+cuda_capability_raw=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1)
+if [[ -z "$nvidia_gpu_count" || "$nvidia_gpu_count" -eq 0 ]]; then
+    echo "ERROR: No NVIDIA GPUs detected (nvidia-smi count: ${nvidia_gpu_count:-unset})." >&2
+    exit 1
+fi
+if [[ -z "$cuda_capability_raw" || "$cuda_capability_raw" == "N/A" ]]; then
+    cuda_capability=0
+else
+    cuda_capability=${cuda_capability_raw/./}  # e.g., 8.9 -> 89
+fi
```
♻️ Duplicate comments (1)
tests/examples/test_onnx_ptq.sh (1)
203-214: Fix array membership test (ShellCheck SC2199/SC2076)

Use glob match on the space-joined array; the current regex-with-quotes is brittle and flagged by ShellCheck.
Apply this diff:
- if [[ " ${latency_models[@]} " =~ " $model_name " ]]; then + # Use glob match on joined array to test membership (SC2199/SC2076) + if [[ " ${latency_models[*]} " == *" $model_name "* ]]; then CUDA_VISIBLE_DEVICES=$gpu_id python evaluate.py \ --onnx_path=$eval_model_path \ - --engine_path=$engine_path \ + --engine_path=$engine_path \ --model_name="${timm_model_name[$model_name]}" \ --engine_precision=$precision \ --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv &
🧹 Nitpick comments (7)
tests/examples/test_onnx_ptq.sh (7)
38-58: Harden CLI argument parsing (shift inside for-loop is ineffective)

Switch to a while/case parser so shifts take effect and unknown flags are surfaced.
Apply this diff:
```diff
-# Parse arguments
-clean_mode=true
-imagenet_path=""
-models_folder=""
-
-for arg in "$@"; do
-    case $arg in
-        --no-clean)
-            clean_mode=false
-            shift
-            ;;
-        *)
-            if [ -z "$imagenet_path" ]; then
-                imagenet_path="$arg"
-            elif [ -z "$models_folder" ]; then
-                models_folder="$arg"
-            fi
-            shift
-            ;;
-    esac
-done
+# Parse arguments
+clean_mode=true
+imagenet_path=""
+models_folder=""
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        --no-clean)
+            clean_mode=false
+            shift
+            ;;
+        -*)
+            echo "Unknown option: $1" >&2
+            exit 2
+            ;;
+        *)
+            if [[ -z "$imagenet_path" ]]; then
+                imagenet_path="$1"
+            elif [[ -z "$models_folder" ]]; then
+                models_folder="$1"
+            else
+                echo "Unexpected positional arg: $1" >&2
+                exit 2
+            fi
+            shift
+            ;;
+    esac
+done
```
81-86: Make model discovery robust when no .onnx files exist

Enable nullglob to avoid iterating a literal pattern, and fail fast with a clear message.
Apply this diff:
```diff
-declare -a model_paths
-for model in "$models_folder"/*.onnx; do
-    model_paths+=("$model")
-done
+shopt -s nullglob
+declare -a model_paths=("$models_folder"/*.onnx)
+shopt -u nullglob
+if [[ ${#model_paths[@]} -eq 0 ]]; then
+    echo "ERROR: No .onnx models found in: $models_folder" >&2
+    exit 1
+fi
```
64-66: Validate input directories early with helpful errors

Catch misconfigured imagenet/models paths before doing work.
Apply this diff:
```diff
 imagenet_path=${imagenet_path:-/data/imagenet/}
 models_folder=${models_folder:-/models/onnx}
+if [[ ! -d "$imagenet_path" ]]; then
+    echo "ERROR: imagenet_path not found: $imagenet_path" >&2
+    exit 1
+fi
+if [[ ! -d "$models_folder" ]]; then
+    echo "ERROR: models_folder not found: $models_folder" >&2
+    exit 1
+fi
```
36-36: Quote paths and add a trap to always unwind the directory stack

Avoid word-splitting and ensure popd runs even on early exits.
Apply this diff:
```diff
-pushd $public_example_dir
+pushd "$public_example_dir"
+# Always return to original dir, even on errors
+trap 'popd >/dev/null' EXIT
```

Optionally remove the explicit popd calls at lines 246 and 250 since the trap handles it.
117-127: Quote path expansions to handle spaces safely

Low-risk now, but quoting prevents subtle bugs.
Apply this diff:
```diff
-    model_dir=build/$model_name
+    model_dir="build/$model_name"
@@
-        mkdir -p $model_dir/$mode
+        mkdir -p "$model_dir/$mode"
@@
-    cp $model_path $model_dir/fp16/model.onnx
+    cp "$model_path" "$model_dir/fp16/model.onnx"
```
151-157: Quote CLI args and ensure consistent style

Keep style consistent (either use --arg=value everywhere or split args uniformly) and quote paths.
Apply this diff:
```diff
-            CUDA_VISIBLE_DEVICES=$gpu_id python -m modelopt.onnx.quantization \
-                --onnx_path=$model_dir/fp16/model.onnx \
-                --quantize_mode=$quant_mode \
-                --calibration_data=$calib_data_path \
-                --output_path=$model_dir/$quant_mode/model.quant.onnx \
-                --calibration_eps=cuda:0 &
+            CUDA_VISIBLE_DEVICES=$gpu_id python -m modelopt.onnx.quantization \
+                --onnx_path="$model_dir/fp16/model.onnx" \
+                --quantize_mode="$quant_mode" \
+                --calibration_data="$calib_data_path" \
+                --output_path="$model_dir/$quant_mode/model.quant.onnx" \
+                --calibration_eps="cuda:0" &
```

Apply similar quoting for the evaluate.py invocations.
240-248: Optional: gate cleanup symmetry via flag only (keep initial artifacts when troubleshooting)

Current flow always runs rm -rf build at the start; --no-clean only preserves end artifacts. If you'd like to keep prior runs for diffing, consider skipping the initial rm when --no-clean is set.
Example tweak:
```diff
-rm -rf build
-mkdir -p build
+if [[ "$clean_mode" == true ]]; then
+    rm -rf build
+fi
+mkdir -p build
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)

- examples/onnx_ptq/evaluate.py (2 hunks)
- modelopt/torch/_deploy/_runtime/tensorrt/engine_builder.py (3 hunks)
- modelopt/torch/_deploy/_runtime/trt_client.py (2 hunks)
- tests/examples/test_onnx_ptq.sh (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- modelopt/torch/_deploy/_runtime/trt_client.py
- modelopt/torch/_deploy/_runtime/tensorrt/engine_builder.py
- examples/onnx_ptq/evaluate.py
🧰 Additional context used
🪛 Shellcheck (0.10.0)
tests/examples/test_onnx_ptq.sh
[error] 203-203: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 203-203: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (1)
tests/examples/test_onnx_ptq.sh (1)
188-200: Confirmed: evaluate.py accepts the engine_precision values used by the test

examples/onnx_ptq/evaluate.py defines --engine_precision as type=str (default "stronglyTyped") and sets deployment["precision"] = args.engine_precision, so "fp16", "best" (int8_iq), and "stronglyTyped" are accepted as-is.
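A minimal sketch of that wiring, assuming the flag name, default, and dict key exactly as quoted above (abridged, not a verbatim copy of evaluate.py):

```python
# Abridged sketch based on the quoted behavior; not verbatim evaluate.py code.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--engine_precision", type=str, default="stronglyTyped")
args = parser.parse_args(["--engine_precision", "best"])  # e.g., int8_iq mode

deployment = {"precision": args.engine_precision}
assert deployment["precision"] in ("fp16", "best", "stronglyTyped")
```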
What does this PR do?
Type of change: Bug fix
Overview:
Testing
Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit
New Features
Tests
Documentation